Abstract

I review the phylogenetic implications of eight duplications of nuclear genes encoding isozymes in Clarkia (Onagraceae). These include ADH, cytosolic PGI, and both plastid and cytosolic isozymes of PGM, 6PGD, and TPI. The PGI duplication has been studied intensively from biochemical and genetic standpoints. Recent results have identified two levels of regulation that operate in species with this duplication, one that reduces cytosolic PGI activity to the level characteristic of species without the duplication (dosage compensation), and the second that results in differential accumulation of the products of the duplicate genes. These factors appear to reduce the impact of the duplication on metabolic function. I also describe our recent cloning and sequencing of two genes encoding PGI obtained from a genomic DNA library of C. unguiculata, a species with the duplication. The two genes encode proteins of 548 and 543 amino acids, respectively, and their predicted amino acid sequences are 58% homologous. They show 65% homology to a previously published partial amino acid sequence of pig muscle PGI. Both genes lack introns. The two genes are the first nuclear genes sequenced in wild plants. They are being studied as part of a research program on gene evolution and the application of nuclear gene sequences for phylogenetic reconstruction in higher plants.


1 This and the following three papers comprise the proceedings of the Missouri Botanical Garden's 34th Annual Systematics Symposium: Macromolecular Approaches to Phylogeny. The symposium took place in St. Louis, Missouri on October 9 and 10, 1987.

2 The molecular genetics results (library construction, gene cloning and sequencing) described in this report were obtained in my lab by Dr. R. C. Tait, Debbie Laudencia, and Byron Froman. The molecular genetics research was supported by National Science Foundation grant BSR 86-07054 and USDA 86-CRCR-1-2139.

3 Department of Genetics, University of California, Davis, California 95616, U.S.A.

Questions about phylogeny have the form, "Is A more closely related to B than to C?" For flowering plants, the best phylogenies are thought to take into account the "maximum number of attributes possible" (Davis & Heywood, 1967: 485), with evidence from morphology, cytology, chemistry, reproductive compatibility, and other fields somehow combined. However, accurate phylogenetic reconstruction is more often a goal than an achievement because of problems brought about by character convergence, functional and developmental correlations, and unequal rates of evolution in different lineages. The essential difficulty is that little or nothing is known about how genetic changes that affect developmental processes result in differences in character expression. The consequence is that no present procedure can translate the extent of morphological divergence into a measure of the closeness of phylogenetic relationship. I believe the way out of this impasse is to utilize a new source of evidence to assess phylogenetic relationships. The data of morphology, the traditional source of information about phylogeny, should be viewed as relevant to studies of plant development.

There is good reason to believe that information derived directly or indirectly from the structure and sequence of protein and DNA can be used to settle many phylogenetic questions. Interestingly, in this context, the molecular data are self-sufficient in that their usefulness does not depend on concordance with other lines of phenotypic evidence. For example, certain types of changes, particularly duplications of nuclear genes encoding enzymes (Gottlieb & Weeden, 1979; Gottlieb, 1983; Odrzykoski & Gottlieb, 1984) and large inversions of the chloroplast genome (Jansen & Palmer, 1987) appear to occur only once within a lineage. Thus taxa that now possess them probably descended from a single common ancestor and can be considered monophyletic without regard to their present morphological and cytological divergence.

In addition to phylogenetic inferences made on the basis of unique genetic and molecular traits, cladograms based on shared derived mutations or the extent of overall similarity can be constructed by comparing nucleotide sequences of genes or the pattern of fragments cut from homologous DNAs by restriction endonucleases. The increasing availability of molecular data suggests that biosystematics no longer has to be considered an "unending synthesis" (Constance, 1964).

Phylogenetic relationships can now be determined accurately and reliably at many taxonomic levels. When this is done, the phylogeny can be used as a framework to ask important questions in other areas of biology. For example, how the attributes of species reflect both genetic legacy and selected and other changes since their origins, how genetic changes lead to specific modifications of ontogeny that result in new characters, and how and whether these new traits facilitate adaptation to different habitats. From this perspective, phylogeny can begin to inform both developmental and ecological analyses by providing evidence that A indeed derived from B and not from C.

In this paper, I review genetic and biochemical studies from my laboratory on gene duplications in Clarkia with emphasis on their phylogenetic implications. In addition, I describe very recent studies in which we have cloned and sequenced several genes encoding the glycolytic enzyme phosphoglucose isomerase from genomic libraries of Clarkia DNA. One purpose of these studies is to infer correct phylogenetic relationships in this well-studied plant genus. When the beginning and end points of species' genealogies are identified, we can ask about the steps in between.

Background

Previous to our studies and, indeed, making them appropriate, were the intensive investigations by Professor Harlan Lewis and his students and colleagues in the 1950s and 1960s (Lewis, 1953, 1962, 1973; Lewis & Lewis, 1955; Lewis & Raven, 1958). They correlated evidence from field studies, morphology, and a major program of hybridization and cytogenetical analysis. Clarkia was found to comprise at least 43 species, 33 being diploid. The diploid species were distinguished by substantial amounts of chromosomal repatterning in addition to aneuploidy. The extent of morphological divergence varied from a difference in a single character between some pairs of species to differences in entire suites of traits that might serve as evidence of generic distinction in other plant groups. The degree of morphological resemblance was frequently not concordant with the amount of chromosomal rearrangement. Nevertheless it was possible to discern meaningful phylogenetic patterns among the diploid species, and they were assigned to seven taxonomic sections (Lewis & Lewis, 1955). Allopolyploid species linked several sections so that as a whole the genus was considered a natural unit.

Lewis formulated an elegant model of speciation to account for these relationships. The critical features of this model included the following: (1) species were regarded as progenitor and derivative and not as siblings; (2) a new species differed from its parent by gross chromosomal rearrangements and sometimes by a change in basic number; (3) the speciation process was rapid and abrupt; (4) speciation was independent of the evolution of new adaptations and therefore was largely fortuitous; (5) speciation, in general, occurred at the xeric margin of the distribution of the parent species.
Lewis's model and his proposed examples of progenitor-derivative species made Clarkia appropriate for the first studies carried out in plants that applied electrophoretic analysis of enzymes to assess the amount of genetic divergence correlated with speciation (Gottlieb, 1973, 1974a). The rationale behind these studies has been reviewed by Crawford (1983, 1985) and by me (Gottlieb, 1977, 1981, 1986).

In addition to information about variation (presence, number, frequency) and divergence of alleles at loci coding enzymes, electrophoretic patterns provide evidence about the number of isozymes of particular enzymes and, thereby, the number of coding loci. As more and more species of Clarkia were examined, it became apparent that they sometimes differed among themselves or from other diploid seed plants in the number of isozymes of many particular enzymes. Subsequent genetic studies revealed that increased isozyme number resulted from duplications of the coding genes (Gottlieb, 1977; Gottlieb & Weeden, 1979; Pichersky & Gottlieb, 1983).

Examination of the number of isozymes in a broad array of higher plants, including conifers and angiosperms, showed that isozyme number was highly conserved and depended on the number of subcellular compartments in which a particular catalytic reaction occurred (Gottlieb, 1982). For example, in diploid plants, enzymes of glycolysis and the oxidative pentose phosphate pathway are encoded in the nucleus and are generally found as two isozymes, one located in the plastids and the other in the cytosol. When the number of isozymes within a particular compartment is more than one, it probably results from duplication of the structural gene or, in polyploid plants, from additive expression of the genes in the several constituent genomes. Since the conserved number of isozymes reflects the metabolic requirements of plant cells, a reduced number is not possible because it would be lethal. (Failure to observe bands of enzyme activity following electrophoresis of plant extracts should not be taken as evidence that the enzyme is not present in the extract, a common error in many surveys of electrophoretic variation in plants that report the absence of an expected enzyme band as a null allele.) The rules for recognizing duplicate isozymes, following electrophoresis of plant extracts, have been thoroughly described (Gottlieb, 1982). It is worth noting again that subcellular location furnishes the best criterion for recognizing the homology of isozymes from different species, and that the rules apply only to enzymes assayed with natural in vivo substrates. No regularities have been identified in the number of isozymes of enzymes such as esterases, phosphatases, and peroxidases that are generally assayed with artificial substrates.

Gene Duplication in Clarkia 

The first duplicate isozyme discovered in Clarkia was that of alcohol dehydrogenase (ADH) in C. franciscana (Gottlieb, 1974b). Its absence from the closely related C. amoena and C. rubicunda, along with the very low genetic identity between C. franciscana and these species (Gottlieb, 1973), helped reject the hypothesis (Lewis & Raven, 1958) that C. franciscana was a recent derivative of C. rubicunda. The genetic evidence for duplication of ADH in C. franciscana was based on its exhibiting a true-breeding, three-banded electrophoretic pattern, whereas similar three-banded patterns in the related species resulted from heterozygosity at a single locus as evidenced by segregation patterns in progeny. Since C. franciscana did not display polymorphism for ADH, the duplication model was tested by making interspecific hybrids between it and C. amoena. The C. amoena plants used were homozygous at a single locus for an allele that encoded a slow ADH variant. The F1 hybrids displayed a five-banded pattern that could only have resulted from the dimeric associations of three different polypeptides and, consequently, they must have possessed three genes (Gottlieb, 1974b). The ADH duplication was the second duplication of a gene encoding an enzyme discovered in plants. The first, in maize, was also an ADH (Schwartz & Endo, 1966).
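The band-counting argument for a dimeric enzyme can be made concrete with a short sketch. It assumes that subunits associate freely in all pairwise combinations, which is the usual working model for dimeric isozymes but is stated here as an assumption; the subunit labels are hypothetical. Two polypeptides can yield at most three dimers, so a five-banded pattern requires at least three polypeptides. Three polypeptides allow up to six dimers; the five bands reported would fit this if two dimer classes comigrate, which is an interpretation and not a result from the source.

```python
from itertools import combinations_with_replacement


def dimer_classes(subunits):
    """Return every distinct dimer formed by free pairwise association."""
    return ["".join(pair) for pair in combinations_with_replacement(subunits, 2)]


# One locus, two alleles (a heterozygote): two homodimers and one heterodimer.
heterozygote = dimer_classes(["F", "S"])

# Three different polypeptides, as inferred for the F1 hybrid (labels are hypothetical).
hybrid = dimer_classes(["A", "B", "C"])

print(len(heterozygote), heterozygote)
print(len(hybrid), hybrid)
```

In general, n distinct subunits give n(n + 1)/2 possible dimers, so the number of bands sets a lower bound on the number of polypeptides and hence of genes.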

Seven additional duplications of genes in Clarkia have since been described and, for each, the taxonomic distribution within the genus has been determined (Table 1). These duplications are cytosolic phosphoglucose isomerase (PGI) (Gottlieb, 1977; Gottlieb & Weeden, 1979), plastid and cytosolic triose phosphate isomerase (TPI) (Pichersky & Gottlieb, 1983), plastid and cytosolic 6-phosphogluconate dehydrogenase (6PGD) (Odrzykoski & Gottlieb, 1984), and plastid and cytosolic phosphoglucomutase (PGM) (Soltis et al., 1987). Detailed information about them is available in the individual reports. Five of the seven duplications (plastid and cytosolic 6PGD, plastid and cytosolic TPI, and plastid PGM) are present in species of all diploid sections of Clarkia (Table 1), suggesting they are at least as old as the genus. But only the duplicated plastid TPI was found in every species.
PLASTID AND CYTOSOLIC 6PGD

Four species of Clarkia appear to lack one or both 6PGD duplications (Odrzykoski & Gottlieb, 1984). Clarkia rostrata and C. epilobioides have a single plastid isozyme and a single cytosolic one and, consequently, lost both duplications. Clarkia lewisii and C. cylindrica have duplicated plastid 6PGDs but only a single cytosolic 6PGD (Table 1). The four species have been assigned to sect. Peripetasma, with the morphologically similar and crossable (Davis, 1970) C. rostrata, C. lewisii, and C. cylindrica to one subsection and the distinctive and highly self-pollinating C. epilobioides to a monotypic subsection (Lewis & Lewis, 1955). The close relationship of the former three species suggested that the loss of the duplicated cytosolic 6PGD occurred in their common ancestor and was subsequently followed in C. rostrata by an additional mutation or chromosomal deletion that silenced a duplicated gene encoding a plastid 6PGD. Since C. epilobioides also lacked both duplications, it seemed reasonable to suggest that it was closely related, although it was not possible to decide if the loss of its plastid 6PGD duplication was independent of the loss in C. rostrata.

The matter was settled by a restriction endonuclease analysis of chloroplast DNA carried out on all the species in this section, which revealed that C. rostrata and C. epilobioides were sister species and that C. lewisii and C. cylindrica comprised a second pair of sister species (Sytsma & Gottlieb, 1986a). The chloroplast DNA study also showed that the two pairs of species share a common ancestor well removed from the other species of the section. Thus, even though C. rostrata is not morphologically similar to C. epilobioides and was placed in a different subsection, the two species have a close genealogical relationship. Since this phylogenetic inference was based on evidence from both nuclear genes and chloroplast DNA, it is particularly strong.


PLASTID AND CYTOSOLIC TPI 


Both TPI duplications appear to be present throughout Clarkia (Table 1), although some uncertainty remains in regard to the cytosolic TPI in sect. Eucharidium for which the genetic analysis is incomplete (Pichersky & Gottlieb, 1985). Electrophoretic studies of TPI have also been carried out on a number of species of other genera of Onagraceae to ascertain the taxonomic distribution of the duplications outside of Clarkia. Since sufficient (or appropriate) material was not available to conduct genetic analysis, three criteria had to be met to warrant the hypothesis that a given species possessed a TPI duplication. The minimum number of electromorphs per individual for each isozyme had to be at least three (TPI is dimeric), the multiple isozymes had to be located in the same subcellular compartment, and a side-by-side comparison of leaf and pollen extracts had to show the same number of cytosolic isozymes (the criteria are discussed in detail in Gottlieb, 1983). On the basis of satisfying all of these criteria (although sample sizes were very limited), the cytosolic TPI duplication was identified in five of the seven tribes of the family, including Jussiaeeae (Ludwigia), Fuchsieae (Fuchsia), Hauyeae (Hauya), Onagreae (Clarkia, Heterogaura, Camissonia, Calylophus, Gongylocarpus, and Oenothera), and Epilobieae (Boisduvalia) (Pichersky & Gottlieb, unpubl.). The presence of the duplication in both Fuchsia and Ludwigia, the two most ancient lineages in the family (Raven, 1979), suggests its great antiquity. In contrast, the plastid TPI duplication was not identified outside of Clarkia and must have arisen much more recently. Although these results should be regarded as exploratory, they point out the possibility that certain taxonomically widespread duplications may be useful to group genera (and eventually families) into monophyletic assemblages. However, since the time spans in these comparisons are great, it would be appropriate and necessary to validate the conclusions by examination of the nucleotide sequences of the duplicated genes.

PLASTID AND CYTOSOLIC PGM 

In contrast to the situation in 6PGD in which the absence of duplicated genes could be assigned to some type of mutation in common ancestors of extant species, the loss of the plastid PGM duplication (Table 1) in C. concinna and in C. lassenensis (Soltis et al., 1987) must be regarded as independent events in lineages directly ancestral to these species but to no others, since the two species belong to distantly related sections of Clarkia (Lewis & Lewis, 1955).

The presence of the cytosolic PGM duplication in C. arcuata (sect. Rhodanthos) and in all species of sections Godetia and Myxocarpa (Table 1) is consistent with a taxonomic assignment previously made by Lewis & Lewis (1955). They proposed (p. 261) that sect. Rhodanthos (then designated sect. Primigenia) was "probably directly ancestral" to sect. Godetia and "perhaps" to sect. Myxocarpa. Within sect. Rhodanthos, the relevant lineage is now represented by C. arcuata which, together with C. lassenensis, was placed in a distinct subsection. The other subsection containing diploid species includes C. amoena, C. rubicunda, and C. franciscana, and it then would represent the lineage from which the other four sections of Clarkia (Table 1) eventually evolved. Alternatively, the cytosolic PGM duplication may have had independent origins in C. arcuata and sections Godetia and Myxocarpa. Sequence comparisons of the PGM genes will make it possible to distinguish these models.

Table 1. The phylogenetic distribution of duplicate isozymes in diploid species of Clarkia. The PGI data are from Gottlieb & Weeden (1979), the PGM data from Soltis et al. (1987), the 6PGD data from Odrzykoski & Gottlieb (1984), and the TPI data from Pichersky & Gottlieb (1983). The numeral 1 indicates the species has a single isozyme and the numeral 2 indicates duplicated isozymes. For each enzyme, plastid (Pl) and cytosolic (Cy) isozymes are indicated.

                                Isozyme Number
                                PGI  PGM       6PGD      TPI
Section / Species               Cy   Pl   Cy   Pl   Cy   Pl   Cy

Eucharidium
  C. breweri                    2    2    1    2    2    2    ?
  C. concinna                   2    1    1    2    2    2    ?
Fibula
  C. bottae                     2    2    1    2    2    2    2
Peripetasma
  C. cylindrica                 2    2    1    2    1    2    2
  C. lewisii                    2    2    1    2    1    2    2
  C. epilobioides               2    2    1    1    1    2    2
  C. rostrata                   1    2    1    1    1    2    2
  C. biloba subsp. australis    2    2    1    2    2    2    2
  C. dudleyana                  2    2    1    2    2    2    2
  C. lingulata                  2    2    1    2    2    2    2
  C. modesta                    2    2    1    2    2    2    2
  Heterogaura heterandra        2    2    1    2    2    ?    2
Phaeostoma
  C. xantiana                   2    2    1    2    2    2    2
  C. unguiculata                2    2    1    2    2    2    2
Godetia
  C. imbricata                  1    2    2    2    2    2    2
  C. nitens                     1    2    2    2    2    2    2
  C. speciosa subsp. polyantha  1    2    2    2    2    2    2
  C. williamsonii               1    2    2    2    2    2    2
Myxocarpa
  C. mildrediae                 1    2    2    2    2    2    2
  C. virgata                    1    2    2    2    2    2    2
Rhodanthos
  C. arcuata                    1    2    2    2    2    2    2
  C. lassenensis                1    1    1    2    2    2    2
  C. amoena subsp. huntiana     1    2    1    2    2    2    2
  C. franciscana                1    2    1    2    2    2    2
  C. rubicunda                  1    2    1    2    2    2    2
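The taxonomic distribution of the cytosolic PGI and cytosolic PGM duplications in Table 1 can be checked mechanically. The sketch below transcribes those two columns, reading each entry as 1 or 2 and listing Heterogaura heterandra under Peripetasma as it is positioned in the table; both readings are assumptions about the printed layout. The variable names are illustrative only.

```python
# (section, species, cytosolic PGI, cytosolic PGM) isozyme numbers from Table 1.
TABLE1 = [
    ("Eucharidium", "C. breweri", 2, 1),
    ("Eucharidium", "C. concinna", 2, 1),
    ("Fibula", "C. bottae", 2, 1),
    ("Peripetasma", "C. cylindrica", 2, 1),
    ("Peripetasma", "C. lewisii", 2, 1),
    ("Peripetasma", "C. epilobioides", 2, 1),
    ("Peripetasma", "C. rostrata", 1, 1),
    ("Peripetasma", "C. biloba subsp. australis", 2, 1),
    ("Peripetasma", "C. dudleyana", 2, 1),
    ("Peripetasma", "C. lingulata", 2, 1),
    ("Peripetasma", "C. modesta", 2, 1),
    ("Peripetasma", "Heterogaura heterandra", 2, 1),
    ("Phaeostoma", "C. xantiana", 2, 1),
    ("Phaeostoma", "C. unguiculata", 2, 1),
    ("Godetia", "C. imbricata", 1, 2),
    ("Godetia", "C. nitens", 1, 2),
    ("Godetia", "C. speciosa subsp. polyantha", 1, 2),
    ("Godetia", "C. williamsonii", 1, 2),
    ("Myxocarpa", "C. mildrediae", 1, 2),
    ("Myxocarpa", "C. virgata", 1, 2),
    ("Rhodanthos", "C. arcuata", 1, 2),
    ("Rhodanthos", "C. lassenensis", 1, 1),
    ("Rhodanthos", "C. amoena subsp. huntiana", 1, 1),
    ("Rhodanthos", "C. franciscana", 1, 1),
    ("Rhodanthos", "C. rubicunda", 1, 1),
]

# Species carrying both cytosolic duplications at once.
both = [sp for _, sp, pgi, pgm in TABLE1 if pgi == 2 and pgm == 2]

# Species with the cytosolic PGM duplication, and sections with the PGI duplication.
pgm_species = [sp for _, sp, _, pgm in TABLE1 if pgm == 2]
pgi_sections = sorted({sec for sec, _, pgi, _ in TABLE1 if pgi == 2})

print("both duplications:", both)
print("cytosolic PGM duplication:", pgm_species)
print("sections with PGI duplication:", pgi_sections)
```

With the table read this way, no species carries both duplications, and the PGI duplication is confined to four sections.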


Regardless of the outcome of such comparisons, the taxonomic distribution of the cytosolic PGM duplication is independent of the sectional phylogeny suggested (Lewis, 1980) following the discovery of the cytosolic PGI duplication (Gottlieb, 1977; Gottlieb & Weeden, 1979), since the two duplications are not present together in any species (Table 1). The PGM evidence suggests that the four sections that have the PGI duplication (Table 1) arose from the lineage within sect. Rhodanthos that also gave rise to C. amoena, C. rubicunda, and C. franciscana. It is also an interesting possibility that since the two enzymes catalyze adjacent reactions in glycolysis and gluconeogenesis (PGI interconverts fructose-6-phosphate and glucose-6-phosphate, and PGM interconverts the latter and glucose-1-phosphate), there may be metabolic reasons that select against the occurrence of both duplicated enzymes in the same cytosol.

Overall, the genetic and biochemical evidence from the several gene duplications provides a remarkably consistent and coherent picture of the phylogenetic relationships within Clarkia. The evidence is also consistent with the recent discovery based on restriction endonuclease patterns in chloroplast DNA that the monotypic Heterogaura heterandra (Table 1) is actually a Clarkia and closely related to C. dudleyana (Sytsma & Gottlieb, 1986b).

PGI

The PGI duplication in Clarkia has been studied intensively because it was one of the first duplications identified that is present in some but not all species of a single genus. Thus, it is possible to compare duplicate PGI genes and their products with their nonduplicate homologues, and the comparisons can be done in species having a relatively similar genomic background. The example provides an unusual opportunity to examine the critical early stages of gene evolution and to test the general model that major changes in gene regulation, structure, and function cannot evolve without the availability of duplicate sequences.

The PGI duplication characterizes all of the species (except C. rostrata) in the morphologically advanced and diverse sections Eucharidium, Fibula, Phaeostoma, and Peripetasma, and is absent from sections Godetia, Myxocarpa, and Rhodanthos (Table 1). Consequently it identifies a specific branching point in the phylogeny of Clarkia and [serves] to group the former four sections into a monophyletic lineage (Gottlieb & Weeden, 1979; Lewis, 1980). The realignment was effected without having to move any species into or out of any section (Lewis, 1980).

Genetic studies revealed that the duplicate PGI genes assort independently (Gottlieb, 1977; Gottlieb & Weeden, 1979; Weeden & Gottlieb, 1979), which is thought to mean that they arose by a process involving overlapping reciprocal translocations or insertional translocations rather than unequal crossing-over. The relevant arguments were presented in Gottlieb (1983). Many other duplicate genes in plants also assort independently (Tanksley, 1987). The mode of origin is important for phylogenetic reconstructions because chromosomal rearrangement is much more likely than unequal crossing-over to occur only once for a particular chromosome segment in a particular lineage. Linkage relationships for the other duplications in Clarkia have not been studied in similar detail, although we do know that the duplicate genes encoding plastid TPIs and one of them and a cytosolic TPI gene also assort independently (Pichersky & Gottlieb, 1983).

A number of biochemical studies were carried out to determine how much and what type of divergence marked the duplicate PGI isozymes. Three results are noteworthy, one having to do with the molecular weight of PGI subunits and the other two with the evolution of regulatory factors that appear to modulate the expression of the duplicate PGI genes.

PGI subunits encoded by the duplicate genes have different apparent molecular weights (apparent because the values were obtained from their electrophoretic mobility on SDS gels), with PGI-2 being 60,400 and PGI-3 59,000, or values closely similar (Gottlieb & Higgins, 1984a). Species in sect. Myxocarpa that lack the duplication have PGI subunits with molecular weight of 60,400, and PGIs from sections Godetia and Rhodanthos weighed in at 59,000. The presence of two molecular weight forms in species with the duplication and each molecular weight form by itself in species without the duplication was unlikely to have come about by chance. The result suggested the novel possibility that the PGI locus in an ancestral Clarkia was translocated to different nonhomologous chromosomes, that the genes then accumulated mutational changes that encoded different molecular weight subunits, and that lines carrying the different chromosomes eventually hybridized with both PGI genes becoming segregated into a single genome by a process originally documented in maize (Burnham, 1962) involving overlapping reciprocal translocations. The scenario seems feasible for Clarkia, in which species are distinguished by gross amounts of chromosomal rearrangement, and which all have a self-compatible breeding system permitting chromosomal heterozygotes to be made homozygous and true-breeding by self-pollination. The merits of this speculation can be directly tested by comparing nucleotide sequences of PGI genes from species with and without the duplication (see below).

After it became apparent that the catalytic properties of the duplicate and nonduplicate PGIs were alike (Higgins & Gottlieb, 1984), studies turned to questions about increased gene dosage and whether it caused increased levels of cytosolic PGI activity and protein. The PGI levels in clarkias with and without the duplication were assessed by immunological means using an antiserum specific to cytosolic PGI (i.e., one that does not cross-react with plastid PGI). The result was clear-cut. The two types of species had the same levels of cytosolic PGI activity and protein, suggesting that some form of regulation had evolved that "compensated" for the duplicated genes (Gottlieb & Higgins, 1984b).

The activity level proved to be the same as that in a number of diploid vegetables, indicating that green plants generally maintain a similar PGI level. This finding provided an important rationale for the evolution of dosage compensation because it restored an activity level characteristic of typical diploid plants having a single cytosolic PGI. Thus a regulatory mechanism had evolved that reduced the impact of the duplication on metabolic function.

To determine whether the regulation operated via metabolic or genetic factors, a series of null activity mutants of each duplicate gene was induced by ethyl methanesulfonate (EMS) treatment of seedlings of C. xantiana (Jones et al., 1986). Metabolic factors would be implicated if lesions induced in either gene did not change PGI levels. In homozygous state, each mutant completely lacked the homodimer activity normally specified by the affected gene. The mutants were backcrossed to wildtype for five generations, making it possible to assign changes in PGI activity directly to the mutation and not to unknown factors in the background. Immunological analysis revealed that they reduced PGI activity in direct proportion to the normal contribution of each gene. The homozygous mutants at Pgi2 reduced cytosolic PGI activity to 36% of wildtype, and the mutant at Pgi3 to 64%. The effects of the mutations at the two loci were additive. Thus, Pgi2^null 2^null, Pgi3^a 3^null plants synthesized in an F2 progeny from experimental hybrids between the two mutants exhibited only 14% of wildtype activity. The double homozygous null was lethal. The results demonstrated that PGI activity in plants having the duplication is not directly regulated by metabolic factors, warranting the suggestion that the dosage compensation depends on factors that regulate the levels of transcription or translation (Jones et al., 1986). Since Pgi3 contributes less than Pgi2 to the total cytosolic PGI activity, the regulatory factors appear to operate to a greater extent on the former locus. Thus, two levels of regulation were identified, one that reduces cytosolic PGI activity in species with the duplication to the level characteristic of species without the duplication, and the second that results in differential accumulation of the products of the duplicate genes.
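The additivity of the null mutations can be illustrated with a small calculation. The sketch assumes a strictly additive gene-dosage model in which each functional allele supplies half of its locus's share of wild-type activity, and it reads the locus shares off the homozygous null results (64% for Pgi2, 36% for Pgi3). It further assumes that the plants with 14% activity were homozygous null at Pgi2 and heterozygous for the null at Pgi3. The model, the function name, and that genotype reading are assumptions made for illustration, not statements from the source.

```python
# Share of wild-type cytosolic PGI activity attributed to each locus,
# inferred from the homozygous null mutants (36% remains without Pgi2,
# 64% remains without Pgi3).
CONTRIBUTION = {"Pgi2": 0.64, "Pgi3": 0.36}


def expected_activity(functional_copies):
    """Fraction of wild-type activity under a strictly additive dosage model.

    functional_copies maps each locus to its number of functional alleles (0, 1, or 2).
    """
    return sum(CONTRIBUTION[locus] * n / 2 for locus, n in functional_copies.items())


wildtype = expected_activity({"Pgi2": 2, "Pgi3": 2})
pgi2_null = expected_activity({"Pgi2": 0, "Pgi3": 2})
pgi3_null = expected_activity({"Pgi2": 2, "Pgi3": 0})
# Assumed genotype of the plants reported at 14% of wildtype activity.
pgi2_null_pgi3_het = expected_activity({"Pgi2": 0, "Pgi3": 1})
double_null = expected_activity({"Pgi2": 0, "Pgi3": 0})

for label, value in [
    ("wildtype", wildtype),
    ("Pgi2 null", pgi2_null),
    ("Pgi3 null", pgi3_null),
    ("Pgi2 null, Pgi3 heterozygous", pgi2_null_pgi3_het),
    ("double null", double_null),
]:
    print(f"{label}: {value:.2f}")
```

Under these assumptions the model predicts 18% for the combined genotype, close to the 14% reported, and zero activity for the double homozygous null, consistent with its lethality.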

The genetic and biochemical analyses of PGI in Clarkia identified a number of interesting questions that can be answered only with evidence from the sequences of the coding genes. For example, in terms of phylogenetics, it is necessary to test the major hypothesis that the duplication had a unique origin, with the consequence that the four sections that possess it are monophyletic. A corollary hypothesis is that the origin of the duplication involved hybridization between lineages now represented by Myxocarpa (which has the higher molecular weight subunit) and Godetia/Rhodanthos (with the low molecular weight subunit). The hypotheses can be tested by comparing the sequences of duplicate and nonduplicate PGI genes. On the corollary hypothesis, Pgi2 from a species with the duplication should be similar to Pgi from Myxocarpa, and Pgi3 from the species with the duplication should be similar to Pgi from Godetia/Rhodanthos. In other words, the duplicate genes should be more similar to genes from different species than they are to each other.
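The prediction can be phrased as a simple comparison of pairwise similarities. The sketch below uses percent identity over pre-aligned sequences as the similarity measure, which is a simplification chosen for illustration; a real test would use a proper alignment and a phylogenetic method. All four sequences are invented toy strings, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned sequences, ignoring gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def supports_hybrid_origin(pgi2, pgi3, pgi_myxocarpa, pgi_godetia):
    """True if each duplicate gene is closer to its proposed parental gene
    than the two duplicates are to each other."""
    between_duplicates = percent_identity(pgi2, pgi3)
    return (
        percent_identity(pgi2, pgi_myxocarpa) > between_duplicates
        and percent_identity(pgi3, pgi_godetia) > between_duplicates
    )


# Invented toy sequences, for illustration only.
pgi2 = "ATGGCTTCAA"
pgi3 = "ATGACTGCTA"
pgi_myxocarpa = "ATGGCTTCAG"
pgi_godetia = "ATGACTGCTC"

print(percent_identity(pgi2, pgi3))
print(supports_hybrid_origin(pgi2, pgi3, pgi_myxocarpa, pgi_godetia))
```

If the duplicates instead proved more similar to each other than to either nonduplicate gene, the comparison would return False and the hybridization scenario would lose support.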

Other questions of interest in a context of evolutionary biology have to do with the extent of PGI sequence divergence in species with the duplication versus those without the duplication, the extent of polymorphism for PGI genes in natural populations of Clarkia, and the value of the sequences to demonstrate phylogenetic relationships outside of Clarkia, particularly among the diverse genera included in tribe Onagreae.

A different set of questions must be answered to explain how the cytosolic PGI level is reduced in species with the duplication to that characteristic of those without it, to determine the basis for the near 2:1 difference in PGI activities attributable to the duplicate genes, to learn why Pgi2 encodes a higher molecular weight subunit than Pgi3, and to determine the nature of the mutations that eliminated PGI activity in the EMS-induced null mutants.

Clarkia PGI Gene Sequences 

Headway on these questions can now be made because we have cloned and sequenced PGI genes from several Clarkia genomic libraries. Here I describe how these genes were obtained, evidence that they encode PGI, and their general structure. Detailed characterizations and sequences will be presented separately. To my knowledge, the Clarkia PGI genes are the first nuclear genes from wild plants that have been sequenced.

FIGURE 1. Restriction map and sequencing strategy for the Clarkia unguiculata U2 gene, which encodes PGI. 
The gene is present on a 4.45-kb BamHI fragment. Restriction sites shown are BamHI (B), HpaI (H), SphI 
(Sp), PstI (P), SalI (S), BglII (Bg), EcoRI (E), PvuII, NcoI (N), and KpnI (K). The arrows above 
and below the restriction map show the direction and extent of sequencing for the individual M13 subclones. 
3.8 kb including the entire coding region was sequenced on one strand, and 1.6 kb on the complementary strand. 

Our first genomic library was constructed with 
DNA isolated from seedlings of a horticultural strain 
of C. unguiculata (Northrup King "Clarkia Double 
Mixed Colors"), a species with the PGI duplication. 
Horticultural material was used because very large 
amounts of seed could be purchased, permitting us 
to fine-tune our techniques prior to studying natural 
populations. The DNA was extracted by a procedure 
modified from Fischer & Goldberg (1982) 
that yielded a nuclear pellet that, after lysis, provided 
high molecular weight DNA fragments 
(greater than 100 kb). The DNA was partially 
restricted with Sau3A, and fragments between 15 
and 23 kb were obtained by fractionation on a sucrose 
gradient. After determining optimal ratios of chromosomal 
DNA and vector arms, the DNA fragments 
were cloned into the BamHI site of the 
lambda replacement vector Charon 35. The re¬ 
sulting library is estimated to contain 1.8 x 10° 
phage with 88% recombinants and represents about 
seven Clarkia genomes. 

TABLE 2. Homology between predicted amino acid 
sequences from nucleotide sequences of U2 and U8, 
cloned from a genomic library of Clarkia unguiculata, 
and amino acid sequences of cyanogen-bromide peptides 
purified from pig muscle PGI (Achari et al., 1981). 

                              Number Identical 
                              Amino Acids/Total 
                              Number of 
Sequence                      Amino Acids         Homology 
U2 vs. U8                     319/548             58% 
U2 vs. Pig                    110/166             66% 
U8 vs. Pig                    108/166             65% 
U2 vs. U8 (in sequences 
  covered by Pig peptides)    89/165              54% 

The library was screened at low stringency (51°C, 
5× SSPE) with an 800-bp DNA fragment of a 
yeast gene encoding PGI, kindly provided by a 
biotechnology firm. Since we expected low to very 
low homology between the yeast and Clarkia PGI 
sequences, the screening conditions were determined 
in a prior experiment in which the probe 
was hybridized on a Southern blot to genomic C. 
unguiculata DNA digested with several restriction 
enzymes. Two positive clones were obtained from 
the first 30,000 plaques examined. They were purified, 
and DNA prepared from each was restricted 
with several enzymes, subjected to agarose gel 
electrophoresis, and analyzed by Southern blots using 
the yeast PGI DNA fragment as probe. The two 
clones had inserts of 13.7 and 15 kb, which proved 
different. Hybridizing fragments of the former clone, 
designated U2, were cloned into M13mp10 and 
partially sequenced. The sequences showed homology 
to that of the yeast gene. A 4.45-kb BamHI 
fragment (Fig. 1) was then subcloned into pUC19 
and deletion fragments constructed using the 
exonuclease III-S1 protocol of Henikoff (1984). One 
strand of 3.8 kb including the entire coding region 
was completely sequenced, and 1.6 kb was sequenced 
on the complementary strand by the 
dideoxy sequencing protocol (Messing, 1983). The 
U2 sequence revealed an uninterrupted open reading 
frame of 1,644 nucleotides encoding a protein 
of 548 amino acids. 
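
As a consistency check on these figures, an open reading frame of 1,644 nucleotides corresponds to 1,644 / 3 = 548 codons, which matches the stated protein length if the stop codon is not counted. The following is a minimal Python sketch of scanning one strand for the longest ATG-initiated open reading frame; the function name and the toy sequence are hypothetical and do not represent the U2 sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> str:
    """Return the longest ATG-to-stop open reading frame on the given strand,
    excluding the stop codon (empty string if none is found)."""
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i - start > len(best):
                    best = dna[start:i]
                start = None
    return best


# Toy sequence for illustration only.
toy = "CCATGGCTAAATTTTGACC"
orf = longest_orf(toy)
print(orf, len(orf) // 3)

# Length check for the figures quoted in the text.
assert 1644 % 3 == 0 and 1644 // 3 == 548
```

The sketch examines only the three frames of one strand; a fuller search would also scan the reverse complement.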

FIGURE 2. Comparison of the predicted amino acid sequences encoded by Clarkia unguiculata U2 and U8 
genes with the amino acid sequences from five cyanogen-bromide peptides from pig muscle PGI (Achari et al., 
1981). An open box drawn across the black bars indicates the same amino acid appears in the corresponding 
position on two or three of the sequences. The amino acids are numbered on the right beginning with the first 
methionine in the U2 sequence. U2 encodes 548 amino acids, U8 encodes 543 amino acids, and the total number 
of amino acids identified in the pig peptides is 166. The diagram represents the best fit by eye, taking into 
account several short insertions and deletions in the sequences. 

The identity of U2 was established by comparing 
its predicted amino acid sequence with the amino 
acid sequences of five cyanogen-bromide peptides 
obtained from pig muscle PGI (Achari et al., 1981). 
These are the only PGI sequences, protein or DNA, 
that have been published for any organism. The five pig 
peptides identify a total of 166 amino acids, about 
30% of the complete protein. The U2 gene encodes 
amino acids that are identical to those in pig PGI 
at 110 of these 166 residues, or 66% of the total 
(Table 2). A second PGI gene, called U8, was also 
obtained from the C. unguiculata genomic library, 
using U2 as the probe; it contains 
the same sequence present in the 15-kb insert 
noted above. The U8 clone was characterized with 
the same isolation and sequencing strategy used 
for U2. U8 proved to have a 65% homology 
to pig PGI (Table 2), and encodes a protein 
of 543 amino acids. 
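
The percentages reported in Table 2 follow directly from the counts of identical residues. A small Python check, using only the counts given in the table and text (the dictionary layout and variable names are illustrative choices):

```python
# Identical residues / residues compared, as listed in Table 2.
comparisons = {
    "U2 vs. U8": (319, 548),
    "U2 vs. Pig": (110, 166),
    "U8 vs. Pig": (108, 166),
    "U2 vs. U8 (pig-peptide regions)": (89, 165),
}

percent = {name: round(100 * same / total)
           for name, (same, total) in comparisons.items()}

for name, value in percent.items():
    print(f"{name}: {value}%")

# Share of the 548-residue U2 protein covered by the 166 pig-peptide residues.
coverage = round(100 * 166 / 548)
print(f"pig peptide coverage: {coverage}%")
```

Rounded to whole percentages, the computed values agree with the table.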

The predicted amino acid sequences show that 
U2 and U8 are 58% homologous over their entire 
coding regions. Comparing U2 and U8 only in the 
regions covered by the pig peptides, the two sequences 
are 54% identical (Table 2). Thus, the 
two Clarkia PGI genes differ more from each other 
than either does from pig PGI. The homology of 
the Clarkia and pig sequences is diagrammed in 
Figure 2. The two Clarkia proteins exhibit large 
blocks of very high amino acid identity as well as 
many shorter regions of nonidentity. Several lengthy 
portions of the three sequences show complete identity. 
Overall, the high homology between the pig 
PGI amino acid sequences and the predicted Clarkia 
amino acid sequences establishes with certainty 
that both Clarkia genes encode PGI. 

On the basis of lack of interruption in their open 
reading frames and the lengths of their sequences, 
which encode proteins that have closely similar 
molecular weights to that previously determined 
for Clarkia PGI, neither Clarkia gene appears to 
include introns (and see below). Otherwise, both genes 
have many features expected of eukaryotic genes, 
including potential TATA boxes and other upstream 
regions similar to known regulatory sequences. 
A complete transcriptional characterization 
of the genes will be reported separately. 

Clarkia unguiculata possesses the PGI duplication, 
and its genome must include two loci encoding 
cytosolic PGIs and one locus encoding plastid 
PGI. Since a heterologous probe was used to 
obtain the U2 and U8 PGI genes, it was necessary 
to determine which isozyme is encoded by each 
gene. A priori, the expectation was that sequences 
encoding the cytosolic PGIs would be more similar 
to each other than either would be to the plastid 
PGI. Genes encoding plastid and cytosolic glycolytic 
isozymes have been cloned and sequenced in 
plants only for tobacco glyceraldehyde-3-phosphate 
dehydrogenase (G3PD) (Shih et al., 1986), 
and the results of that study are closely relevant 
to our research with PGI. Comparison of predicted 
amino acid sequences from cDNAs showed that 
the tobacco cytosolic G3PD was more similar to 
other eukaryotic G3PD enzymes, with about 65% 
homology, than it was to the tobacco plastid isozyme, 
with 45% homology. The homologies of U2, 
U8, and pig PGI are roughly similar to these values, 
but we were able to compare only a few sequences. 

Our initial attempt to identify the isozymes encoded 
by the Clarkia genes centered on the search 
for correlation between restriction fragment lengths 
and allelic segregation. This could be followed by 
PGI activity staining on starch gels following 
electrophoresis of leaf extracts and correlated with the 
RFLP segregation. To date, we have examined a 
number of DNAs from single C. unguiculata plants 
by restriction analysis followed by electrophoresis 
and Southern blotting. The DNAs proved highly 
polymorphic, but we have been able to match 
restriction fragments to U2. U8 and several other 
genes cloned from the C. unguiculata library have 
not yet been similarly matched, but this is not 
[...]lings and this commercial strain is highly 
polymorphic. 

However, a different procedure suggests that 
U8 encodes the slowly migrating allozyme PGI-3B, 
a cytosolic isozyme. The U8 sequence was 
inserted in pUC18, downstream from the 
beta-galactosidase promoter. When the operon was 
induced by IPTG, the E. coli host synthesized very 
large quantities of PGI protein. The PGI was 
catalytically active and had a very slightly faster 
electrophoretic mobility on starch gels than the slow 
allozyme PGI-3B of C. unguiculata, a difference 
probably caused by different post-translational 
protein modification between Clarkia and E. coli. By 
the same procedure, a large quantity of protein 
with the molecular weight of PGI was also 
synthesized from U2, and its electrophoretic mobility was 
similar to that of Clarkia plastid PGI. The expression 
of these genomic clones in E. coli, apparently 
by virtue of fortuitous promoters in their 5' noncoding 
region, provides convincing evidence that 
introns are not present in these genes. Whether 
other PGI genes also lack introns remains to be 
determined. Their absence is surprising, since other 
genes encoding glycolytic enzymes in plants have 
introns: maize Tpi has eight (Marchionni & Gilbert, 
1986) and maize Adh has nine (Dennis 
et al., 1984). 

To summarize the molecular studies, we have 
cloned and sequenced two PGI genes from a genomic 
library of C. unguiculata, a species with the 
PGI duplication. The genes have a homology of 
58%; one of them (U8) appears to encode a 
cytosolic PGI-3 isozyme; the other is thought to 
encode a plastid PGI. We have also constructed genomic 
libraries from Clarkia species without the PGI 
duplication and have obtained clones of a number 
of sequences homologous to the PGI probes from 
C. unguiculata. The molecular genetics studies of 
PGI in Clarkia constitute one of the first analyses 
of the evolution of a plant nuclear gene. Many 
additional molecular studies are called for to 
understand gene evolution and to improve phylogenetic 
reconstructions. 